5 research outputs found

No-Regret Online Reinforcement Learning with Adversarial Losses and Transitions

Author: Chang William
Jin Tiancheng
Liu Junyan
Luo Haipeng
Rouyer Chloé
Wei Chen-Yu
Publication date: 30/05/2023

Existing online learning algorithms for adversarial Markov Decision Processes achieve $\mathcal{O}(\sqrt{T})$ regret after $T$ rounds of interactions even if the loss functions are chosen arbitrarily by an adversary, with the caveat that the transition function has to be fixed. This is because it has been shown that adversarial transition functions make no-regret learning impossible. Despite such impossibility results, in this work, we develop algorithms that can handle both adversarial losses and adversarial transitions, with regret increasing smoothly in the degree of maliciousness of the adversary. More concretely, we first propose an algorithm that enjoys $\widetilde{\mathcal{O}}(\sqrt{T} + C^{\textsf{P}})$ regret, where $C^{\textsf{P}}$ measures how adversarial the transition functions are and can be at most $\mathcal{O}(T)$. While this algorithm itself requires knowledge of $C^{\textsf{P}}$, we further develop a black-box reduction approach that removes this requirement. Moreover, we also show that further refinements of the algorithm not only maintain the same regret bound, but also simultaneously adapt to easier environments (where losses are generated in a certain stochastically constrained manner as in Jin et al. [2021]) and achieve $\widetilde{\mathcal{O}}(U + \sqrt{UC^{\textsf{L}}} + C^{\textsf{P}})$ regret, where $U$ is some standard gap-dependent coefficient and $C^{\textsf{L}}$ is the amount of corruption on losses.

Comment: 66 pages

arXiv.org e-Print Archive

Tsallis-INF for decoupled exploration and exploitation in multi-armed bandits

Author: Rouyer Chloé
Seldin Yevgeny
Publication venue: PMLR
Publication date: 01/01/2020

Copenhagen University Research Information System

An algorithm for stochastic and adversarial bandits with switching costs

Author: Cesa-Bianchi Nicolò
Rouyer Chloé
Seldin Yevgeny
Publication venue: PMLR
Publication date: 01/01/2021

We propose an algorithm for stochastic and adversarial multiarmed bandits with switching costs, where the algorithm pays a price $\lambda$ every time it switches the arm being played. Our algorithm is based on an adaptation of the Tsallis-INF algorithm of Zimmert and Seldin (2021) and requires no prior knowledge of the regime or time horizon. In the oblivious adversarial setting it achieves the minimax optimal regret bound of $O\big((\lambda K)^{1/3}T^{2/3} + \sqrt{KT}\big)$, where $T$ is the time horizon and $K$ is the number of arms. In the stochastically constrained adversarial regime, which includes the stochastic regime as a special case, it achieves a regret bound of $O\left(\big((\lambda K)^{2/3} T^{1/3} + \ln T\big)\sum_{i \neq i^*} \Delta_i^{-1}\right)$, where $\Delta_i$ are the suboptimality gaps and $i^*$ is a unique optimal arm. In the special case of $\lambda = 0$ (no switching costs), both bounds are minimax optimal within constants. We also explore variants of the problem, where switching cost is allowed to change over time. We provide experimental evaluation showing competitiveness of our algorithm with the relevant baselines in the stochastic, stochastically constrained adversarial, and adversarial regimes with fixed switching cost.

arXiv.org e-Print Archive

Copenhagen University Research Information System

Correlation between DNA Methylation and cell proliferation identifies new candidate predictive markers in meningioma

Author: Battaglia-Hsu Shyue-Fang
Divoux Marion
Gauchotte Guillaume
Godfraind Catherine
Hergalant Sébastien
Lacomme Stéphanie
Pouget Celso
Rech Fabien
Rouyer Pierre
Saurel Chloé
Publication venue: MDPI AG
Publication date: 01/12/2022

Meningiomas are the most common primary tumors of the central nervous system. Based on the 2021 WHO classification, they are classified into three grades reflecting recurrence risk and aggressiveness. However, the WHO’s histopathological criteria defining these grades are somewhat subjective. Together with reliable immunohistochemical proliferation indices, other molecular markers such as those studied with genome-wide epigenetics promise to revamp the current prognostic classification. In this study, 48 meningiomas of various grades were randomly included and explored for DNA methylation with the Infinium MethylationEPIC microarray over 850k CpG sites. We conducted differential and correlative analyses on grade and several proliferation indices and markers, such as mitotic index and Ki-67 or MCM6 immunohistochemistry. We also set up Cox proportional hazard models for extensive associations between CpG methylation and survival. We identified loci highly correlated with cell growth and a targeted methylation signature of regulatory regions persistently associated with proliferation, grade, and survival. Candidate genes under the control of these regions include SMC4, ESRRG, PAX6, DOK7, VAV2, OTX1, and PCDHA-PCDHB-PCDHG, i.e., the protocadherin gene clusters. This study highlights the crucial role played by epigenetic mechanisms in shaping dysregulated cellular proliferation and provides potential biomarkers bearing prognostic and therapeutic value for the clinical management of meningioma.

HAL Clermont Université

Directory of Open Access Journals

PubMed Central

La qualité environnementale en milieu urbain [Environmental quality in urban settings]

This issue of the journal Méditerranée brings together studies in geography, ecology, urban planning, landscape, and related fields that examine the environmental quality of the living environment in 15 cities in Europe and the Mediterranean Basin. It sets out to question the normative approaches that today underpin real-estate and urban-planning products invoking the notion of "high environmental quality", as well as the consensual discourse on the "sustainable city". The issue aims first to enrich the standard criteria for assessing urban environmental quality (EQ), which usually proceed top-down and rely on indices or multi-criteria combinations. To that end, the articles relate the material, measurable parameters of the environmental quality of urban developments (for example, urban vegetation, noise and quiet, pollution, and thermal comfort) to residents' sensory perceptions and subjective assessments, as well as to the political and economic stakes of the relationship to the environment. The objective is to understand the social construction of "environmental value systems" as a whole, taking into account their political stakes, socio-economic relations, and the practices, representations, and territorial claims of residents in the contexts analyzed.

OpenEdition